[Android CoreCLR] Log managed callstacks on native crash #123824

mdh1418 · 2026-01-30T23:27:17Z

Android CoreCLR currently does not ship with CreateDump. When native crashes occur, it is difficult to figure out the underlying cause of the crash, as managed stacks aren't captured anywhere, and Android's crash reporter doesn't understand runtime symbols when it generates a tombstone.

Until we have a solution for creating a dump on Android CoreCLR, this PR looks to log managed callstacks for the crashing thread whenever a dump would have been created. That way, there is some hint as to what may have caused the crash, despite not having the native callstack.

These managed callstacks will only be emitted during FailFast. Synchronous faults currently will not hit PROCCreateCrashDumpIfEnabled, as Android registers default signal handlers that the runtime defers signal handling to. See #123735.

Example ADB log

01-30 17:08:40.422 25992 26031 I DOTNET  : Hello, Android!
01-30 17:08:40.436 25992 26031 F DOTNET  : =================================================================
01-30 17:08:40.437 25992 26031 F DOTNET  : 	Managed Stacktrace:
01-30 17:08:40.437 25992 26031 F DOTNET  : =================================================================
01-30 17:08:40.452 25992 26031 E DOTNET  :    at System.Runtime.EH.DispatchEx(System.Runtime.StackFrameIterator ByRef, ExInfo ByRef)
01-30 17:08:40.453 25992 26031 E DOTNET  :    at Program.ThrowCatchExceptions(Int32)
01-30 17:08:40.453 25992 26031 E DOTNET  :    at Program.Main(System.String[])
01-30 17:08:40.453 25992 26031 F DOTNET  : =================================================================
01-30 17:08:40.453 25992 26031 F DOTNET  : Aborting process.

dotnet-policy-service · 2026-01-30T23:27:52Z

Tagging subscribers to this area: @agocke
See info in area-owners.md if you want to be subscribed.

Copilot

Pull request overview

This PR enables Android CoreCLR to log managed stack traces when a native crash occurs, providing diagnostic information in environments where CreateDump is not available. The logging is wired through the existing crash-dump hook so that, whenever a dump would be created, the crashing managed thread’s stack is emitted to Android’s logging.

Changes:

Add a HOST_ANDROID helper in EEPolicy that walks and logs the current managed thread’s call stack using existing LogCallstackForLogWorker infrastructure.
Extend PROCCreateCrashDumpIfEnabled on Android to call this helper via a weak symbol and emit a formatted “Managed Stacktrace” section to minipal_log_write_fatal, followed by the existing abort message.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `src/coreclr/vm/eepolicy.cpp` | Introduces `LogCallstackForAndroidNativeCrash`, an `extern "C"` Android-only helper that obtains the current `Thread` and logs its managed call stack, to be invoked from the PAL crash-dump path. |
| `src/coreclr/pal/src/thread/process.cpp` | On Android, adds a weak reference to `LogCallstackForAndroidNativeCrash` and, when present, logs a managed stacktrace banner and the managed stack in `PROCCreateCrashDumpIfEnabled` before logging that the process is aborting. |
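
The weak-reference pattern described in the table can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it assumes a GCC/Clang toolchain on an ELF platform where `__attribute__((weak))` is available, and the names `HypotheticalLogManagedCallstack` and `ReportCrash` are made up for the example.

```cpp
#include <cassert>
#include <cstdio>

// Weak declaration: if no linked object defines this symbol, its address
// resolves to null instead of producing a link error.
// Hypothetical name; the PR's helper is LogCallstackForAndroidNativeCrash.
extern "C" void HypotheticalLogManagedCallstack() __attribute__((weak));

// Returns true when the optional helper was linked in and invoked.
bool ReportCrash()
{
    if (HypotheticalLogManagedCallstack != nullptr)
    {
        HypotheticalLogManagedCallstack();
        return true;
    }
    std::fputs("managed callstack helper not linked; skipping\n", stderr);
    return false;
}
```

The point of the null check is that the lower layer can build and run without the helper, and picks it up automatically when the defining library is linked into the same process.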

src/coreclr/pal/src/thread/process.cpp

mdh1418 · 2026-01-30T23:33:53Z

src/coreclr/pal/src/thread/process.cpp

+        minipal_log_write_fatal("\n=================================================================\n");
+		minipal_log_write_fatal("\tManaged Stacktrace:\n");
+		minipal_log_write_fatal("=================================================================\n");
+        LogCallstackForAndroidNativeCrash();
+        minipal_log_write_fatal("=================================================================\n");


@lateralusX I briefly looked into emitting the callstacks in the same manner that CrashInfo.cs does it. It seemed like its format was designed for managed exceptions, and I don't know if it'd be useful to stub in fields like the type, hr, message, etc. until we know what tools reading the adb log would expect.

If we did want to format it a particular way, it looked like we'd be adding another API to CallStackLogger, for what seems like a temporary solution until Android CoreCLR can create a dump.

You do not have to stub anything in the format produced by CrashInfo.cs.

We have intentionally used json for this format. json makes it straightforward to omit fields that are not relevant or available in the given situation, and also to add additional information specific to the given situation. We have a number of optional fields like that, for example

runtime/src/coreclr/nativeaot/System.Private.CoreLib/src/System/CrashInfo.cs

Lines 249 to 253 in 3df8fb2

    if (moduleBase != nint.Zero)
    {
        if (!WriteHexValue("module"u8, (nuint)moduleBase))
            return false;
    }
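
For readers more at home in the C++ side of the runtime, the optional-field idea in the snippet above could be sketched like this. It is only an illustration: the `FrameInfo` type, the `WriteFrame` and `HexValue` helpers, and the `"ip"` field name are assumptions for the example and are not taken from CrashInfo.cs.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical frame record; a zero moduleBase means "not available".
struct FrameInfo
{
    std::uint64_t ip;
    std::uint64_t moduleBase;
};

// Formats a value as a quoted hex string, e.g. "0x1a2b".
std::string HexValue(std::uint64_t value)
{
    char buffer[2 + 16 + 1];
    std::snprintf(buffer, sizeof(buffer), "0x%llx",
                  static_cast<unsigned long long>(value));
    return std::string("\"") + buffer + "\"";
}

// Emits one JSON object, leaving out "module" when the base is unknown.
std::string WriteFrame(const FrameInfo& frame)
{
    std::string json = "{\"ip\":" + HexValue(frame.ip);
    if (frame.moduleBase != 0)
    {
        json += ",\"module\":" + HexValue(frame.moduleBase);
    }
    json += "}";
    return json;
}
```

A consumer that parses this output simply treats a missing `"module"` key as "unknown", so nothing has to be stubbed in.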

We need to, or we don't need to adopt the same format?

We were considering using the same format since NativeAOT uses it in some scenarios. If we end up generating a dump on Android CoreCLR, it would be nice if it could automatically work with !crashinfo

We need to, or we don't need to adopt the same format?

Sorry, I meant "We do not have to stub..."

We were considering using the same format since NativeAOT uses it in some scenarios.

Yes, it would be great to have one json format for crash reports.

Is it possible to store this json file in some directory that the apps have write access to by default? Is there a directory like that on Android?

I do not think we want to be printing it to Android console (by default) since it tends to be many pages of text.

I think we should generate a crash report in the same json format as used by NAOT and createdump's --crashreport and --crashreportonly options. It needs to run in-proc, so the data we put into the crash report needs to be retrieved on a best-effort basis, and the implementation should be aware that it can run from a signal handler. It should follow the same environment variables as we use on other platforms (but only supporting the arguments that make sense for crash reports).
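
To make the signal-handler constraint concrete, here is a rough sketch of emitting one line of a report without heap allocation or stdio, using only `write(2)`, which POSIX lists as async-signal-safe. The `FormatHex` and `WriteFrameLine` helpers and the `ip=0x...` line layout are hypothetical and only illustrate the technique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unistd.h>

// Formats value as lowercase hex into out without heap allocation or stdio.
// Returns the number of characters written (no terminator is added),
// or 0 if the buffer is too small.
std::size_t FormatHex(std::uint64_t value, char* out, std::size_t capacity)
{
    static const char digits[] = "0123456789abcdef";
    char reversed[16];
    std::size_t count = 0;
    do
    {
        reversed[count++] = digits[value & 0xF];
        value >>= 4;
    } while (value != 0);

    if (count > capacity)
        return 0;
    for (std::size_t i = 0; i < count; i++)
        out[i] = reversed[count - 1 - i];
    return count;
}

// Writes "ip=0x<hex>\n" to fd from a fixed stack buffer.
bool WriteFrameLine(int fd, std::uint64_t ip)
{
    char line[5 + 16 + 1] = {'i', 'p', '=', '0', 'x'};
    std::size_t length = 5;
    length += FormatHex(ip, line + length, 16);
    line[length++] = '\n';
    return write(fd, line, length) == static_cast<ssize_t>(length);
}
```

Anything that needs locks, allocation, or managed code would have to be gathered before the crash or skipped, which is why the report contents are described as best effort.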

The question is whether it should be on by default. The reason it shouldn't be on by default is that the app needs an additional service to get the file off the device, and if nothing like that is used, we could fill up the devices with crash reports and app users need to explicitly clean them up on an app-by-app basis or by the app on relaunch. If an app has additional infrastructure/services to upload managed crash reports, then it would enable crash report support.

Each app can have private files, it can have a private cache folder (nuked by the system), it can use external storage as well as using shared storage with other apps. An app can't write into the crash folders used by the system crash daemon.

I think that an app should opt in to managed crash reporting and give the runtime the path to the location of the crash report, plus additional configuration, through the existing crash dump environment variables.

we could fill up the devices with crash reports and app users need to explicitly clean them up on an app-by-app basis or by the app on relaunch.

If it is on by default, we need to make sure to store only the last crash report (or only a few crash reports), overwriting or deleting older crash reports as new ones are created.

Yes, if we decide to enable the full crash report by default on Android, we need to constrain the number of crash reports. Since it will be in text format, we could also consider storing it compressed to reduce the disk footprint even more.
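
One way to cap the number of stored reports, as discussed above, is to prune the oldest files whenever a new one is about to be written. The following is a sketch under stated assumptions: it needs C++17 `std::filesystem`, the `PruneCrashReports` helper is hypothetical, and it would run at app startup or before writing a report, not from a signal handler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Deletes the oldest regular files in reportDir until at most maxReports
// remain. Returns the number of files removed. Errors are ignored, so a
// missing or unreadable directory simply results in nothing being removed.
std::size_t PruneCrashReports(const fs::path& reportDir, std::size_t maxReports)
{
    std::error_code ec;
    std::vector<fs::directory_entry> reports;
    for (fs::directory_iterator it(reportDir, ec), end; !ec && it != end; it.increment(ec))
    {
        if (it->is_regular_file(ec))
            reports.push_back(*it);
    }

    // Newest first, so everything past index maxReports is the oldest data.
    std::sort(reports.begin(), reports.end(),
              [](const fs::directory_entry& a, const fs::directory_entry& b) {
                  std::error_code ignored;
                  return a.last_write_time(ignored) > b.last_write_time(ignored);
              });

    std::size_t removed = 0;
    for (std::size_t i = maxReports; i < reports.size(); i++)
    {
        if (fs::remove(reports[i].path(), ec))
            removed++;
    }
    return removed;
}
```

The directory passed in should be dedicated to crash reports, since every regular file in it is a candidate for deletion.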

jkotas · 2026-01-31T01:13:00Z

src/coreclr/pal/src/thread/process.cpp

-    // TODO: Dump all managed threads callstacks into logcat and/or file?
+    if (LogCallstackForAndroidNativeCrash != nullptr)
+    {
+        minipal_log_write_fatal("\n=================================================================\n");


Do we need the ======================== separators? We do not use separators like that anywhere else in CoreCLR.

No they're not needed, I had initially just used the format that Android Mono was using

runtime/src/mono/mono/mini/mini-exceptions.c

Lines 3005 to 3011 in 2772854

    g_async_safe_printf ("\n=================================================================\n");
    g_async_safe_printf ("\tManaged Stacktrace:\n");
    g_async_safe_printf ("=================================================================\n");
    mono_walk_stack_full (print_stack_frame_signal_safe, mctx, jit_tls, mono_get_lmf (), MONO_UNWIND_LOOKUP_IL_OFFSET | MONO_UNWIND_SIGNAL_SAFE, NULL);
    g_async_safe_printf ("=================================================================\n");

jkotas · 2026-01-31T01:14:19Z

What is the experience for unhandled managed exceptions or fail fasts with this fix? We should make sure that the stacktrace is logged just once.

jkotas · 2026-01-31T01:22:19Z

Example ADB log

What was the source code or runtime change for this example?

jkotas · 2026-01-31T01:24:30Z

src/coreclr/pal/src/thread/process.cpp

+    if (LogCallstackForAndroidNativeCrash != nullptr)
+    {
+        minipal_log_write_fatal("\n=================================================================\n");
+        minipal_log_write_fatal("\tManaged Stacktrace:\n");


This should be something more self-explanatory than just "Managed stacktrace:". Maybe something like "Crash in native code".

I thought PROCCreateCrashDumpIfEnabled also gets hit for unhandled managed exceptions. Would it still be fine to classify that as a crash in native code? Maybe "Crash report:" would align better, given the hope that this will be replaced by real create dump logic.

vitek-karas · 2026-01-31T18:21:27Z

Just curious - would we also consider printing out full mixed callstack (native/managed)? The OS stackwalker will only be able to walk the first native section of the stack, once it hits managed code it will stop. So if there's native code above the managed code which would be interesting, we will have no way to learn about it. We could possibly print out just native addresses for the native portion, and rely on symbolication later.

mdh1418 · 2026-01-31T20:05:33Z

What was the source code or runtime change for this example?

I'm modifying the Android.Device_Emulator.JIT.Test Functional test as that was the easiest way to apply runtime changes.
I've been using the following forced native crashes:

FailFast via RaiseFailFastException placed in eventpipe and using the in-proc native EventListener
Segfault via *(volatile int*)0 = 0x1234; in eventpipe and using the in-proc native EventListener
Segfault via P/Invoke memset(null, 0, 1); in managed
Unhandled managed exception

That particular log in the description is the RaiseFailFastException in ep_buffer_write_event.

This log is from an unhandled managed exception (I haven't yet changed the format based on the discussions in the threads above)

01-31 14:45:41.891  1346  1372 E DOTNET  : Unhandled exception. System.Exception: MIHW
01-31 14:45:41.891  1346  1372 E DOTNET  :    at Program.ForceNativeSegv()
01-31 14:45:41.891  1346  1372 E DOTNET  :    at Program.Main(String[] args)
01-31 14:45:41.892  1346  1372 F DOTNET  : =================================================================
01-31 14:45:41.892  1346  1372 F DOTNET  : 	Managed Stacktrace:
01-31 14:45:41.892  1346  1372 F DOTNET  : =================================================================
01-31 14:45:41.893  1346  1372 E DOTNET  :    at Program.ForceNativeSegv()
01-31 14:45:41.893  1346  1372 E DOTNET  :    at Program.Main(System.String[])
01-31 14:45:41.893  1346  1372 F DOTNET  : =================================================================
01-31 14:45:41.893  1346  1372 F DOTNET  : Aborting process.

jkotas · 2026-02-02T02:56:57Z

Here are my thoughts on this:

For unhandled exceptions and fail fasts, the default experience for what you see in the Android console should match what you see on stderr on Windows or Linux. The error message and stacktrace (printed exactly once).
For crashes in unmanaged code, the managed stacktrace is better than nothing, but it will rarely help with diagnosing the problem. For example, I think it is impossible to connect the dots from the managed stacktrace to the crash in eventpipe in the test you have shared. I expect the Android tombstone to be a lot more useful for these; yes, it needs to be symbolicated offline. I am not sure whether it is a good idea to print the managed stacktrace to the console in this case, since it will likely be misleading most of the time. If we want to do it anyway, it needs to come with a good description of what it is to avoid confusion.

lateralusX · 2026-02-02T09:24:13Z

Just curious - would we also consider printing out full mixed callstack (native/managed)? The OS stackwalker will only be able to walk the first native section of the stack, once it hits managed code it will stop. So if there's native code above the managed code which would be interesting, we will have no way to learn about it. We could possibly print out just native addresses for the native portion, and rely on symbolication later.

This would be my next suggestion: extend our current stack walker to handle mixed native/managed frames and output symbolicated managed frames, while native frames will probably follow the same format as we get in the tombstone, address + (module + build id) in text. I think this will be beneficial, not just for Android, and it will be simpler than combining native crash data from the tombstone with the managed stack trace in logcat. If we include native frames in our internal stack walker, we also gain some capabilities to handle scenarios that the native stack walker would have issues handling. On Android we could probably go one step further on higher API levels and use Android's libunwind to cover both native + Java frames when stackwalking IPs that fall outside dotnet managed code.
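
As a rough idea of what such mixed output might look like, the sketch below formats managed frames symbolically and native frames as address + module + build id. The `MixedFrame` type, the `FormatFrame` helper, and the exact line layout are assumptions for illustration; the layout is only loosely modeled on tombstone frame lines and is not a format the runtime produces today.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical frame record for a mixed native/managed stack walk.
struct MixedFrame
{
    bool isManaged;
    std::string managedMethod;  // used when isManaged is true
    std::uint64_t pc;           // module-relative address for native frames
    std::string module;         // native module name
    std::string buildId;        // native module build id
};

// Managed frames are printed by name; native frames are left unsymbolicated
// so that they can be resolved offline from the module and build id.
std::string FormatFrame(const MixedFrame& frame)
{
    if (frame.isManaged)
        return "   at " + frame.managedMethod;

    char pcText[17];
    std::snprintf(pcText, sizeof(pcText), "%016llx",
                  static_cast<unsigned long long>(frame.pc));
    return std::string("   pc ") + pcText + " " + frame.module +
           " (BuildId: " + frame.buildId + ")";
}
```

Keeping the native part as raw addresses means the crash path needs no native symbol lookup, which matches the suggestion to rely on symbolication later.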

Do we all agree it would make sense to extend our runtime stack-walker to have an option to produce mixed stack traces? It would only be used for scenarios like this, so it shouldn't impact other stack walking usage unless enabled.

lateralusX · 2026-02-02T09:31:34Z

What is the experience for unhandled managed exceptions or fail fasts with this fix? We should make sure that the stacktrace is logged just once.

Agree, we should only log the stacktrace for the faulting thread once, since it will go into logcat, which is a limited resource. The question is if we should do it on the call site instead of inside PROCCreateCrashDumpIfEnabled and focus that function on producing the json crash report on Android. In the unhandled managed exception and fail fast scenarios, the managed stack trace is logged on the call site before triggering the call that ends up in PROCCreateCrashDumpIfEnabled. For native crashes, that would mean logging this information as part of the signal handler. The alternative is to change the logic for the existing unhandled managed exception and fail fast paths and make sure all scenarios (including the signal handler) end up calling through the same function for logging the managed stack and then potentially creating the crash dump when enabled. It feels like this change would make things a little cleaner, and we would centralize the logging in one function.
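
The "log exactly once through one function" idea could be guarded with an atomic flag, roughly as sketched here. `LogManagedStackOnce` and `g_stackLogged` are hypothetical names, and `std::fputs` merely stands in for the real logging call (it is not async-signal-safe, so a signal-handler path would need a safe writer).

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>

// Set by the first caller that logs the crashing thread's stack.
static std::atomic<bool> g_stackLogged{false};

// Hypothetical single entry point shared by the unhandled exception,
// fail fast, and signal handler paths. Returns true only for the call
// that actually logged.
bool LogManagedStackOnce(const char* stackText)
{
    // exchange returns the previous value, so exactly one caller sees false.
    if (g_stackLogged.exchange(true))
        return false;

    std::fputs(stackText, stderr);
    return true;
}
```

With a guard like this, a path that already printed the stack at the call site and then falls through into the crash dump path would not produce a second copy in logcat.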

lateralusX · 2026-02-02T09:58:01Z

Here are my thoughts on this:

For unhandled exceptions and fail fasts, the default experience for what you see in the Android console should match what you see on stderr on Windows or Linux. The error message and stacktrace (printed exactly once).

For crashes in unmanaged code, the managed stacktrace is better than nothing, but it will rarely help with diagnosing the problem. For example, I think it is impossible to connect the dots from the managed stacktrace to the crash in eventpipe in the test you have shared. I expect the Android tombstone to be a lot more useful for these; yes, it needs to be symbolicated offline. I am not sure whether it is a good idea to print the managed stacktrace to the console in this case, since it will likely be misleading most of the time. If we want to do it anyway, it needs to come with a good description of what it is to avoid confusion.

I think we should start with the managed stacktrace when hitting unhandled HW exceptions, plus the tombstone. That should give you enough information, and it is available in logcat or in crash reports submitted to the Play Console. The next step is probably mixed stacktraces, since we would get all the information in one place and handle more scenarios than the native stack walker can, which stops at the first managed frame. On Mono we do try to dump the native stacktrace followed by a managed stacktrace, but the native stacktrace has been disabled on Android, so Mono currently only dumps the managed stacktrace, and then the Android crash daemon generates the native crash report.

Maybe we could put a marker frame on top of the top frame of the managed stacktrace in case of a crash in native code? Then we could do the same for transitions in/out of runtime code through the stackwalk, and when we have a mixed-mode stackwalker, we can just replace those parts with the real native stack frames, something like this:

[External Code]
Program.CallSomeNativeCodeThatCrash()
Program.Main(System.String[])

jkotas · 2026-02-02T14:09:33Z

Do we all agree it would make sense to extend our runtime stack-walker to have an option to produce mixed stack traces

This should be a diagnostic-specific stackwalker like what's in createdump today. I do not think we want to be teaching the runtime managed stack-walker that is used by the GC, EH, etc. about walking native frames for diagnostics.

if we should do it on the call site instead of inside PROCCreateCrashDumpIfEnabled

Yes. PROCCreateCrashDumpIfEnabled should keep doing what its name says. It should not print stacktraces to console.

[External Code]

Visual Studio stack window uses the same marker to indicate managed frames in managed libraries outside your project.

Something similar that's more unique like [Native Code] would work.
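
A marker of this kind could be produced by collapsing each run of native frames into a single placeholder line, as in the sketch below. The `WalkedFrame` type and `RenderWithMarkers` helper are hypothetical, and the input frames are assumed to come from some stack walk that already knows which frames are managed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result of a stack walk, ordered from the top of the stack down.
struct WalkedFrame
{
    bool isManaged;
    std::string method;
};

// Renders frames top-down, replacing each consecutive run of native frames
// with a single "[Native Code]" marker line.
std::vector<std::string> RenderWithMarkers(const std::vector<WalkedFrame>& frames)
{
    std::vector<std::string> lines;
    bool inNativeRun = false;
    for (const WalkedFrame& frame : frames)
    {
        if (frame.isManaged)
        {
            lines.push_back(frame.method);
            inNativeRun = false;
        }
        else if (!inNativeRun)
        {
            lines.push_back("[Native Code]");
            inNativeRun = true;
        }
    }
    return lines;
}
```

Once a mixed-mode walker exists, the marker line is the natural place to substitute the real native frames without changing the rest of the output.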

mdh1418 requested a review from lateralusX January 30, 2026 23:27

mdh1418 self-assigned this Jan 30, 2026

mdh1418 added the area-VM-coreclr label Jan 30, 2026

Copilot AI review requested due to automatic review settings January 30, 2026 23:27

mdh1418 added the os-android label Jan 30, 2026

Copilot started reviewing on behalf of mdh1418 January 30, 2026 23:28 View session

Copilot AI reviewed Jan 30, 2026


[Android CoreCLR] Log managed callstacks on native crash

ffb92cc

mdh1418 force-pushed the android_coreclr_native_crash_log_managed_callstack branch from 3d0912d to ffb92cc Compare January 30, 2026 23:40

jkotas reviewed Jan 31, 2026


This was referenced Jan 31, 2026

iOS.Device test WorkItemExecutions #122874

Open

XHarness package install failure on iOS due to devicectl NSPOSIXErrorDomain error 49 #123796

Open

mdh1418 mentioned this pull request Jan 31, 2026

[CoreCLR][Signal] Bump shutdown notif and crashdump before prev handler #123735

Open

simonrozsival mentioned this pull request Feb 2, 2026

[Android] CoreCLR support in .NET 11 dotnet/maui#33386

Open

23 tasks

